Mining a Year of Speech

Authors

  • John Coleman
  • Mark Liberman
  • Greg Kochanski
  • Lou Burnard
  • Jiahong Yuan
Abstract

The availability of large text corpora has revolutionized linguistics and is of great value in many other areas of scholarship. Our “Mining a Year of Speech” project, funded by the transatlantic “Digging into Data” competition, aims to do the same for spoken language. We present a new generation of speech corpora, characterised by aggregation of datasets, annotated using forced alignment and exposed for public use in standard formats across multiple sites.

Similar resources

Feature extraction in opinion mining through Persian reviews

Opinion mining analyzes user reviews to extract opinions, sentiments and demands in a specific area, which can play an important role in major decisions in that area. In general, opinion mining extracts user reviews at three levels: document, sentence and feature. Opinion mining at the feature level is taken into consideration more than the other two levels d...


Trends in Speech and Language Rehabilitation in Iran

This paper is a short review of the form and content of speech and language rehabilitation services and the trend of their institutionalization in Iran. A summary is given of formal education in speech and language therapy in Iran, which originated with the establishment of a 4-year BS rehabilitation program in the College of Rehabilitation Sciences in Tehran in 1974. Since then, speech and language Rehabili...


Speech development and auditory performance in children after cochlear implantation

Background: The aim of this study was to determine the auditory performance of congenitally deaf children and the effect of cochlear implantation (CI) on speech intelligibility. Methods: A prospective study was undertaken on 47 children in a pediatric tertiary referral center for CI. All children were prelingually deaf and were younger than 8 years of age. They were followed up until 5...


Speech enhancement based on hidden Markov model using sparse code shrinkage

This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian (for clean speech and noise respectively) combination in the HMM ...


Improving the performance of MFCC for Persian robust speech recognition

Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...


Publication date: 2011